Using speech and gesture to explore user states in multimodal dialogue systems

Authors

  • Rui Ping Shi
  • Johann Adelhardt
  • Viktor Zeißler
  • Anton Batliner
  • Carmen Frank
  • Elmar Nöth
  • Heinrich Niemann
Abstract

Modern dialogue systems should interpret the user's behavior and state of mind in the same way that human beings do, i.e., in a multimodal manner: communication is not limited to verbal utterances, as is the case in most state-of-the-art dialogue systems, but involves several modalities, e.g., speech, gesture, and facial expression. The design of a dialogue system must therefore adapt to multimodal interaction, and all these modalities have to be combined within the system. This paper describes the recognition of a user's internal state of mind using a prosody classifier based on artificial neural networks, combined with a discrete Hidden Markov Model (HMM) for gesture analysis. Our experiments show that both input modalities can be used to identify the user's internal state. We show that an improvement of up to 70% can be achieved by fusing both modalities.
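One common way to combine a neural prosody classifier with gesture HMMs is score-level fusion, where each modality produces a per-state score and the scores are merged before the final decision. The sketch below illustrates a weighted log-linear fusion of this kind; it is a minimal illustration, not the paper's actual method, and the state labels, weights, and score scales are assumptions.

```python
import math

# Hypothetical user-state labels; the paper's actual state inventory
# is not specified here.
STATES = ["angry", "joyful", "neutral", "hesitant"]

def softmax(log_scores):
    """Turn unnormalized log-scores into a probability distribution."""
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_scores(ann_posteriors, hmm_loglikes, w_speech=0.6, w_gesture=0.4):
    """Score-level fusion of two modality classifiers.

    ann_posteriors : per-state posteriors from the prosody network.
    hmm_loglikes   : per-state log-likelihoods from the gesture HMMs.
    The weights are illustrative; in practice they would be tuned
    on held-out data.
    """
    # Bring the gesture HMM log-likelihoods onto a posterior scale
    # so both modalities are comparable.
    hmm_post = softmax(hmm_loglikes)
    # Weighted log-linear combination of the two modalities.
    log_fused = [w_speech * math.log(a + 1e-12) +
                 w_gesture * math.log(h + 1e-12)
                 for a, h in zip(ann_posteriors, hmm_post)]
    return softmax(log_fused)

def classify(ann_posteriors, hmm_loglikes):
    """Return the user state with the highest fused score."""
    fused = fuse_scores(ann_posteriors, hmm_loglikes)
    return STATES[fused.index(max(fused))]
```

For example, if the prosody network slightly favors the first state but the gesture HMMs strongly favor the second, the fused decision can flip to the second state, which is the behavior that makes fusion pay off when one modality is ambiguous.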


Similar Articles

Referring to Objects with Spoken and Haptic Modalities (Draft version)

The gesture input modality considered in multimodal dialogue systems is mainly reduced to pointing or manipulating actions. With an approach based on the spontaneous character of the communication, the treatment of such actions involves many processes. Without any constraints, the user may use gesture in association with speech, and may exploit the visual context peculiarities, guiding his arti...


Coordination of referring expressions in multimodal human-computer dialogue

This study examines coordination of referring expressions in multimodal human-computer dialogue, i.e. to what extent users’ choices of referring expressions are affected by the referring expressions that the system is designed to use. An experiment was conducted, using a semi-automatic multimodal dialogue system for apartment seeking. The user and the system could refer to areas and apartments ...


A 3D Gesture Recognition System for Multimodal Dialog Systems

We present a framework for integrating dynamic gestures as a new input modality into arbitrary applications. The framework allows training new gestures and recognizing them as user input with the help of machine learning algorithms. The precision of the gesture recognition is evaluated with special attention to the elderly. We show how this functionality is implemented into our dialogue system ...


Multimodal Dialogue for Ambient Intelligence and Smart Environments

Ambient Intelligence (AmI) and Smart Environments (SmE) are based on three foundations: ubiquitous computing, ubiquitous communication, and intelligent adaptive interfaces [41]. This type of system consists of a series of interconnected computing and sensing devices which surround the user pervasively in his environment and are invisible to him, providing a service that is dynamically adapted t...




Publication date: 2003